The proposed model takes the subtopic structure of documents into account. It first uses TextTiling to split each document into text segments that represent subtopics, then calculates the similarity between each pair of text segments drawn from the two documents. Finally, it combines these pairwise similarities using the optimal matching method from graph theory to obtain the overall similarity between the two documents.
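
The three steps described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation, and several parts are assumptions: `split_segments` is a hypothetical stand-in that splits on blank lines instead of running TextTiling, segment similarity is taken to be bag-of-words cosine similarity, the optimal matching is found by brute force over permutations (practical only for a handful of segments; the Kuhn-Munkres algorithm would be the usual choice), and the final score is normalized by the number of matched pairs.

```python
import math
import re
from collections import Counter
from itertools import permutations


def split_segments(document):
    """Hypothetical stand-in for TextTiling: blank-line-separated paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def cosine(seg_a, seg_b):
    """Bag-of-words cosine similarity (an assumed segment similarity measure)."""
    va = Counter(seg_a.lower().split())
    vb = Counter(seg_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def document_similarity(doc_a, doc_b):
    """Combine pairwise segment similarities through a maximum-weight matching."""
    segs_a, segs_b = split_segments(doc_a), split_segments(doc_b)
    if not segs_a or not segs_b:
        return 0.0
    # Keep the document with fewer segments on the row side of the matrix.
    if len(segs_a) > len(segs_b):
        segs_a, segs_b = segs_b, segs_a
    sim = [[cosine(x, y) for y in segs_b] for x in segs_a]
    # Brute-force optimal matching: try every one-to-one assignment of rows
    # to distinct columns and keep the largest total similarity.
    best = max(
        sum(sim[i][j] for i, j in enumerate(cols))
        for cols in permutations(range(len(segs_b)), len(segs_a))
    )
    # Assumption: normalize by the number of matched pairs.
    return best / len(segs_a)
```

Because the matching is order-independent, two documents that contain the same segments in a different order still receive a high score under this sketch, which is the kind of behavior a subtopic-aware measure is meant to capture.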
text segment, meaning: [Computer] <memory> (Intel 8086 CS) The area of memory containing the machine code instructions of a program. The code segment of a program may be shared between multiple processes ru...